Abstract. We prove pointwise convergence, as $N\to\infty$, for the multiple ergodic averages
$\frac{1}{N}\sum_{n=1}^{N} f(T^n x)\cdot g(S^{a_n} x)$, where $T$ and $S$ are commuting measure preserving transformations,
and $a_n$ is a random version of the sequence $[n^c]$ for some appropriate $c > 1$. We also prove
similar mean convergence results for averages of the form $\frac{1}{N}\sum_{n=1}^{N} f(T^{a_n} x)\cdot g(S^{a_n} x)$, as well
as pointwise results when $T$ and $S$ are powers of the same transformation. The deterministic
versions of these results, where one replaces $a_n$ with $[n^c]$, remain open, and we hope that our
method will indicate a fruitful way to approach these problems as well.



1.1. Background and new results. Recent advances in ergodic theory have sparked an 
outburst of activity in the study of the limiting behavior of multiple ergodic averages. Despite 
the various successes in proving mean convergence results, progress towards the corresponding 
pointwise convergence problems has been very scarce. For instance, we still do not know 
whether the averages

(1)    $\frac{1}{N}\sum_{n=1}^{N} f(T^n x)\cdot g(S^n x)$

converge pointwise when $T$ and $S$ are two commuting measure preserving transformations
acting on the same probability space and $f$ and $g$ are bounded measurable functions. Mean
convergence for such averages was shown in [11] and was recently generalized to an arbitrary 
number of commuting transformations in [27] . On the other hand, the situation with pointwise 
convergence is much less satisfactory. Partial results that deal with special classes of trans- 
formations can be found in [21 EJ [22j [231 H]- Without imposing any strictures on the possible 
classes of the measure preserving transformations considered, pointwise convergence is only 
known when $T$ and $S$ are powers of the same transformation [8] (see also [12] for an alternate
proof), a result that has not been improved for twenty years.



2000 Mathematics Subject Classification. Primary: 37A30; Secondary: 28D05, 05D10, 11B25. 

Key words and phrases. Ergodic averages, mean convergence, pointwise convergence, multiple recurrence, 
random sequences, commuting transformations. 

The first author was partially supported by Marie Curie IRG 248008 and the third author by NSF grant 
DMS-0801316. 



1. Introduction 








N. FRANTZIKINAKIS, E. LESIGNE, AND M. WIERDL 



More generally, for fixed $\alpha,\beta\in[1,+\infty)$, one would like to know whether the averages

(2)    $\frac{1}{N}\sum_{n=1}^{N} f(T^{[n^\alpha]}x)\cdot g(S^{[n^\beta]}x)$

converge pointwise. Mean convergence for these and related averages has been extensively
studied, partly because of various links to questions in combinatorics. In particular, mean
convergence is known when $T=S$ and $\alpha,\beta$ are positive integers [18, 25], or positive non-integers
[15]. Furthermore, for general commuting transformations $T$ and $S$, mean convergence is known
when $\alpha,\beta$ are different positive integers [10]. Regarding pointwise convergence, again, the
situation is much less satisfactory. When $\alpha,\beta$ are integers, some partial results for special classes
of transformations can be found in [13] and [24]. Furthermore, pointwise convergence is known
for averages of the form $\frac{1}{N}\sum_{n=1}^{N} f(T^{[n^\alpha]}x)$ with no restrictions on the transformation $T$ ([7]
for integers $\alpha$, and [2U] or [B] for non-integers $\alpha$). But for general commuting transformations
$T$ and $S$, no pointwise convergence result is known, not even when $T=S$ and $\alpha\neq\beta$.

The main goal of this article is to make some progress related to the problem of pointwise
convergence of the averages (2) by considering randomized versions of fractional powers of $n$,
in place of the deterministic ones, for various suitably chosen exponents $\alpha$ and $\beta$. In our first
result, we study a variation of the averages (2) where the iterates of $T$ are deterministic and
the iterates of $S$ are random. More precisely, we let $a_n$ be a random version of the sequence
$[n^\beta]$ where $\beta\in(1,14/13)$ is arbitrary. We prove that almost surely (the set of probability 1 is
universal) the averages

(3)    $\frac{1}{N}\sum_{n=1}^{N} f(T^n x)\cdot g(S^{a_n}x)$

converge pointwise, and we determine the limit explicitly. This is the first pointwise convergence
result for multiple ergodic averages of the form $\frac{1}{N}\sum_{n=1}^{N} f(T^{a_n}x)\cdot g(S^{b_n}x)$, where $a_n$, $b_n$
are strictly increasing sequences and $T$, $S$ are general commuting measure preserving
transformations. In fact, even for mean convergence the result is new, and this is an instance where
convergence of multiple ergodic averages involving sparse iterates is obtained without the use
of rather deep ergodic structure theorems and equidistribution results on nilmanifolds.

In our second result, we study a randomized version of the averages (2) when $\alpha=\beta$. In
this case, we let $a_n$ be a random version of the sequence $[n^\alpha]$ where $\alpha\in(1,2)$ is arbitrary, and
prove that almost surely (the set of probability 1 is universal) the averages

(4)    $\frac{1}{N}\sum_{n=1}^{N} f(T^{a_n}x)\cdot g(S^{a_n}x)$

converge in the mean, and conditionally to the pointwise convergence of the averages (1), they
also converge pointwise. Even for mean convergence, this gives the first examples of sparse
sequences of integers $a_n$ for which the averages (4) converge for general commuting measure
preserving transformations $T$ and $S$.

Because our convergence results come with explicit limit formulas, we can easily deduce 
some related multiple recurrence results. Using the correspondence principle of Furstenberg, 
these results translate to statements in combinatorics about configurations that can be found 
in every subset of the integers, or the integer lattice, with positive upper density. 
Let us also remark that convergence of the averages (1) for not necessarily commuting
transformations is known to fail in general.¹ We prove that this is also the case for the averages (2),
(3), and (4).

We state the exact results in the next section, where we also give precise definitions of the 
concepts used throughout the paper. 



1.2. Precise statements of new results. 



1.2.1. Our setup. We work with random sequences of integers that are constructed by selecting
a positive integer $n$ to be a member of our sequence with probability $\sigma_n\in[0,1]$. More precisely,
let $(\Omega,\mathcal F,\mathbb P)$ be a probability space, and let $(X_n)_{n\in\mathbb N}$ be a sequence of independent random
variables with

$\mathbb P(X_n=1)=\sigma_n$ and $\mathbb P(X_n=0)=1-\sigma_n$.

In the present article we always assume that $\sigma_n=n^{-a}$ for some $a\in(0,1)$. The random
sequence $(a_n(\omega))_{n\in\mathbb N}$ is constructed by taking the positive integers $n$ for which $X_n(\omega)=1$ in
increasing order. Equivalently, $a_n(\omega)$ is the smallest $k\in\mathbb N$ such that $X_1(\omega)+\cdots+X_k(\omega)=n$.
We record the identity

(5)    $X_1(\omega)+\cdots+X_{a_n(\omega)}(\omega)=n$

for future use.

The sequence $(a_n(\omega))_{n\in\mathbb N}$ is what we called a random version of the sequence $n^{1/(1-a)}$ in
the previous subsection. Indeed, using a variation of the strong law of large numbers (see
Lemma 5.6 below), we have that almost surely $\frac{1}{W_N}\sum_{k=1}^{N}X_k(\omega)$ converges to 1, where
$W_N:=\sum_{k=1}^{N}\sigma_k\sim N^{1-a}$. Using the implied estimate for $a_n(\omega)$ in place of $N$, where $n$ is
suitably large, and (5), we deduce that almost surely $a_n(\omega)/n^{1/(1-a)}$ converges to a non-zero constant.
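The construction above is easy to simulate. The following Python sketch is only an illustration and plays no role in the proofs; the function names, the fixed seed, and the parameter values are assumptions made for this example. It draws independent variables $X_n$ with $\mathbb P(X_n=1)=n^{-a}$, lists the selected integers in increasing order to obtain $a_1(\omega) < a_2(\omega) < \cdots$, and checks identity (5) for that realisation.

```python
import random

def sample_indicators(a, limit, seed=0):
    """Return [X_1, ..., X_limit]: independent 0/1 draws with P(X_n = 1) = n**(-a)."""
    rng = random.Random(seed)  # fixed seed: one reproducible realisation omega
    return [1 if rng.random() < n ** (-a) else 0 for n in range(1, limit + 1)]

def random_sequence(indicators):
    """List the integers n with X_n = 1 in increasing order: a_1 < a_2 < ..."""
    return [n for n, x in enumerate(indicators, start=1) if x == 1]

def identity_5_holds(indicators, seq):
    """Check X_1 + ... + X_{a_n} = n for every n, i.e. identity (5)."""
    prefix = [0]
    for x in indicators:
        prefix.append(prefix[-1] + x)  # prefix[k] = X_1 + ... + X_k
    return all(prefix[a_n] == n for n, a_n in enumerate(seq, start=1))

if __name__ == "__main__":
    a = 0.05  # hypothetical choice of the parameter a in (0, 1/14)
    xs = sample_indicators(a, 20000)
    seq = random_sequence(xs)
    n = len(seq)
    print("number of selected integers:", n)
    print("identity (5) holds:", identity_5_holds(xs, seq))
    # Ratio a_n / n**(1/(1-a)) for the last selected integer.
    print("a_n / n^(1/(1-a)):", seq[-1] / n ** (1 / (1 - a)))
```

Since $\sum_{k\le N}k^{-a}$ is comparable to $N^{1-a}/(1-a)$, one expects the printed ratio to approach $(1-a)^{1/(1-a)}$ as the range grows, although a single finite run only suggests this.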

1.2.2. Different iterates. In our first result we study a randomized version of the averages (2)
when $\alpha=1$.

Theorem 1.1. With the notation of Section 1.2.1, let $\sigma_n=n^{-a}$ for some $a\in(0,1/14)$. Then
almost surely the following holds: For every probability space $(X,\mathcal X,\mu)$, commuting measure
preserving transformations $T,S\colon X\to X$, and functions $f,g\in L^\infty(\mu)$, for almost every $x\in X$
we have

(6)    $\lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N} f(T^n x)\cdot g(S^{a_n(\omega)}x)=\tilde f(x)\cdot\tilde g(x)$

where $\tilde f:=\lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N}T^nf=\mathbb E(f|\mathcal I(T))$, $\tilde g:=\lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N}S^ng=\mathbb E(g|\mathcal I(S))$.²



1 See Example 7.1 in [2], or let $T,S\colon\mathbb T\to\mathbb T$ be given by $Tx=2x$, $Sx=2x+\alpha$, and $f(x)=e^{-2\pi i x}$, $g(x)=e^{2\pi i x}$,
where $\alpha\in[0,1]$ is chosen so that the averages $\frac{1}{N}\sum_{n=1}^{N}e^{2\pi i 2^n\alpha}$ diverge.

2 If $(X,\mathcal X,\mu)$ is a probability space, $f\in L^\infty(\mu)$, and $\mathcal Y$ is a sub-$\sigma$-algebra of $\mathcal X$, we denote by $\mathbb E(f|\mathcal Y)$ the
conditional expectation of $f$ given $\mathcal Y$. If $T\colon X\to X$ is a measure preserving transformation, by $\mathcal I(T)$ we denote
the sub-$\sigma$-algebra of sets that are left invariant by $T$.
We remark that the conclusion of Theorem 1.1 can be easily extended to all functions $f\in L^p$,
$g\in L^q$, where $p\in[1,+\infty]$ and $q\in(1,+\infty]$ satisfy $1/p+1/q\le 1$.³

Combining the limit formula of Theorem 1.1 with the estimate (see Lemma 1.6 in [9])

$\int f\cdot\mathbb E(f|\mathcal X_1)\cdot\mathbb E(f|\mathcal X_2)\,d\mu\ \ge\ \Big(\int f\,d\mu\Big)^3$

that holds for every non-negative function $f\in L^\infty(\mu)$ and sub-$\sigma$-algebras $\mathcal X_1$ and $\mathcal X_2$ of $\mathcal X$, we
deduce the following:

Corollary 1.2. With the assumptions of Theorem 1.1, we get almost surely, that for every
$A\in\mathcal X$ we have

$\lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N}\mu(A\cap T^{-n}A\cap S^{-a_n(\omega)}A)\ \ge\ (\mu(A))^3.$

The upper density $\bar d(E)$ of a set $E\subset\mathbb Z^2$ is defined by $\bar d(E):=\limsup_{N\to\infty}\frac{|E\cap[-N,N]^2|}{|[-N,N]^2|}$.
Combining the previous multiple recurrence result with a multidimensional version of Furstenberg's
correspondence principle QTi . we deduce the following: 

Corollary 1.3. With the notation of Section 1.2.1, let $\sigma_n=n^{-a}$ for some $a\in(0,1/14)$. Then
almost surely, for every $v_1,v_2\in\mathbb Z^2$ and $E\subset\mathbb Z^2$ we have

$\liminf_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N}\bar d\big(E\cap(E-nv_1)\cap(E-a_n(\omega)v_2)\big)\ \ge\ (\bar d(E))^3.$

We remark that in the previous statement we could have used the upper Banach density
$d^*$ in place of the upper density $\bar d$. This is defined by $d^*(E):=\limsup_{|I|\to\infty}\frac{|E\cap I|}{|I|}$, where $|I|$
denotes the area of a rectangle $I$ and the lim sup is taken over all rectangles of $\mathbb Z^2$ with side
lengths that increase to infinity. The same holds for the statement of Corollary 1.6 below.

1.2.3. Same iterates. In our next result we study a randomized version of the averages (1). By
$Tf$ we denote the composition $f\circ T$.

Theorem 1.4. With the notation of Section 1.2.1, let $\sigma_n=n^{-a}$ for some $a\in(0,1/2)$. Then
almost surely the following holds: For every probability space $(X,\mathcal X,\mu)$, commuting measure
preserving transformations $T,S\colon X\to X$, and functions $f,g\in L^\infty(\mu)$, the averages

(7)    $\frac{1}{N}\sum_{n=1}^{N}T^{a_n(\omega)}f\cdot S^{a_n(\omega)}g$

converge in $L^2(\mu)$ and their limit equals the $L^2$-limit of the averages $\frac{1}{N}\sum_{n=1}^{N}T^nf\cdot S^ng$ (this
exists by [11]). Furthermore, if $T$ and $S$ are powers of the same transformation, then the
averages (7) converge pointwise.



3 To see this, one uses a standard approximation argument and the fact that the averages $\frac{1}{N}\sum_{n=1}^{N}T^nf$
converge pointwise for $f\in L^p$ when $p\in[1,+\infty]$, and the same holds for the averages $\frac{1}{N}\sum_{n=1}^{N}S^{a_n(\omega)}f$ for
$f\in L^q$ when $q\in(1,+\infty]$ (see for example exercise 3 on page 78 of [26]).
Our argument actually shows that the averages (7) converge pointwise if and only if the
averages (1) converge pointwise. Furthermore, using our method, one can get similar convergence
results for other random multiple ergodic averages. For instance, our method can be
modified and combined with the results from [27] and [10] to show that for every $\ell\in\mathbb N$, if
a n = n~ a and a is small enough (in fact any a G (0, 2 _£ ) works), then almost surely the aver- 

a £ es F -^n=i J i Jl"' 1 i h and 2^ n =i 1 1 /i" J 2 h ' ' ' 1 t h converge m 
the mean. 

Combining Theorem 1.4 with the multiple recurrence result of Furstenberg and Katznelson
[17], we deduce the following:

Corollary 1.5. With the assumptions of Theorem 1.4, we get almost surely, that if $A\in\mathcal X$ has
positive measure, then

$\lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N}\mu(A\cap T^{-a_n(\omega)}A\cap S^{-a_n(\omega)}A)\ >\ 0.$

Combining the previous multiple recurrence result with the correspondence principle of 
Furstenberg [IB] , we deduce the following: 

Corollary 1.6. With the notation of Section 1.2.1, let $\sigma_n=n^{-a}$ for some $a\in(0,1/2)$. Then
almost surely, for every $v_1,v_2\in\mathbb Z^2$, and every $E\subset\mathbb Z^2$ with $\bar d(E) > 0$, we have

$\liminf_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N}\bar d\big(E\cap(E+a_n(\omega)v_1)\cap(E+a_n(\omega)v_2)\big)\ >\ 0.$

1.2.4. Non-recurrence and non-convergence. One may wonder whether the assumption that the
transformations $T$ and $S$ commute can be removed from the statements of Theorems 1.1 and
1.4 and the related corollaries. It can definitely be weakened; probably assuming that the group
generated by $T$ and $S$ is nilpotent suffices, see for example [1] where mean convergence of the
averages (1) is shown under such an assumption. On the other hand, constructions of Berend
(Ex 7.1 in [2]) and Furstenberg (page 40 in [16]) show that Theorem 1.4 and Corollary 1.5
are false if the assumption that the transformations $T$ and $S$ commute is completely removed.
Next, we state a rather general result which implies that one has similar obstructions when
dealing with Theorem 1.1 and Corollary 1.2.

Given a probability space $(X,\mathcal X,\mu)$, we say that a measure preserving transformation $T\colon X\to X$
is Bernoulli if the measure preserving system $(X,\mathcal X,\mu,T)$ is isomorphic to a Bernoulli shift
on finitely many symbols.

Theorem 1.7. Let $a,b\colon\mathbb N\to\mathbb Z\setminus\{0\}$ be two injective sequences. Then there exist a probability
space $(X,\mathcal X,\mu)$ and measure preserving transformations $T,S\colon X\to X$, both of them Bernoulli,
such that

• for some $f,g\in L^\infty(\mu)$ the averages $\frac{1}{N}\sum_{n=1}^{N}\int T^{a(n)}f\cdot S^{b(n)}g\,d\mu$ diverge, and

• for some $A\in\mathcal X$ with $\mu(A) > 0$ we have $\mu(T^{-a(n)}A\cap S^{-b(n)}A)=0$ for every $n\in\mathbb N$.

One can use a variation of our argument to extend Theorem 1.7 to sequences of bounded
multiplicity, meaning sequences $(c(n))$ that satisfy $\sup_{m\in\mathrm{range}(c)}\#\{n\in\mathbb N\mid c(n)=m\} < +\infty$.
On the other hand, Theorem 1.7 cannot be extended to all sequences that take any given
integer value a finite number of times. For instance, it is not hard to show that the pair of
sequences $a(n)=[\sqrt n]$, $b(n)=n$, is good for (multiple) recurrence and mean convergence (see
the proof of Theorem 2.7 in [15]).

1.2.5. Further directions. The restrictions on the range of the eligible parameter $a$ in Theorem 1.1,
Theorem 1.4, and the related corollaries appear to be far from best possible. In
fact, any $a < 1$ is expected to work, but it seems that new techniques are needed to prove this.
This larger range of parameters is known to work for pointwise convergence of the averages
$\frac{1}{N}\sum_{n=1}^{N}f(T^{a_n(\omega)}x)$ (see [5] for mean convergence, [7] for pointwise, and [26] for a survey of
related results). Furthermore, when $\sigma_n=\sigma\in(0,1)$ for every $n\in\mathbb N$, it is not clear whether the
conclusion of Theorem 1.1 related to pointwise convergence holds (see Theorem 4 in [21] for a
related negative pointwise convergence result).

Regarding Theorem 1.1, it seems very likely that similar convergence results hold when
the iterates of the transformation T are given by other "good" deterministic sequences, like 
polynomial sequences. Our argument does not give such an extension because it relies crucially 
on the linearity of the iterates of T. Furthermore, it seems likely that similar convergence results 
hold when the iterates of T and S are both given by random versions of different fractional 
powers, chosen independently. Again our present argument does not seem to apply to this case. 

1.3. General conventions and notation. We use the symbol $\ll$ when some expression is
majorized by a constant multiple of some other expression. If this constant depends on the
variables $k_1,\ldots,k_\ell$, we write $\ll_{k_1,\ldots,k_\ell}$. We say that $a_n\sim b_n$ if $a_n/b_n$ converges to a
non-zero constant. We denote by $o_N(1)$ a quantity that converges to zero when $N\to\infty$ and
all other parameters are fixed. We say that two sequences are asymptotically equal whenever
convergence of one implies convergence of the other and both limits coincide. If $(\Omega,\mathcal F,\mathbb P)$ is
a probability space, and $X$ is a random variable, we set $\mathbb E_\omega(X):=\int X\,d\mathbb P$. We say that
a property holds almost surely if it holds outside of a set with probability zero. We often
suppress writing the variable $x$ when we refer to functions and the variable $\omega$ when we refer to
random variables or random sequences. Lastly, the following notation will be used throughout
the article: $\mathbb N:=\{1,2,\ldots\}$, $Tf:=f\circ T$, $e(t):=e^{2\pi i t}$.

2. Convergence for independent random iterates 

In this section we prove Theorem 1.1. Throughout, we use the notation introduced in
Section 1.2.1 and we also let

$Y_n:=X_n-\sigma_n,\qquad W_N:=\sum_{n=1}^{N}\sigma_n.$

We remark that if $\sigma_n=n^{-a}$ for some $a\in(0,1)$, then $W_N\sim N^{1-a}$.
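The asymptotic $W_N\sim N^{1-a}$ follows from comparing the sum $\sum_{n\le N}n^{-a}$ with the integral of $t^{-a}$, which also identifies the limiting constant as $1/(1-a)$. The short Python sketch below is a numerical sanity check only; the function names and the sample values of $a$ and $N$ are assumptions chosen for illustration.

```python
def partial_weight(a, N):
    """W_N = sum_{n=1}^{N} n**(-a): the expected number of selected integers up to N."""
    return sum(n ** (-a) for n in range(1, N + 1))

def normalized_weight(a, N):
    """W_N / N**(1 - a); by comparison with an integral this tends to 1 / (1 - a)."""
    return partial_weight(a, N) / N ** (1 - a)

if __name__ == "__main__":
    a = 0.5  # hypothetical value in (0, 1); the limit 1 / (1 - a) is then 2
    for N in (100, 10_000, 100_000):
        print(N, normalized_weight(a, N))
```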



Any improvement in the range of the eligible parameter $a$ in the statement of Proposition 2.1 or
Proposition 3.1 would give corresponding improvements in the statement of Theorem 1.1 and Theorem 1.4 and the
related corollaries.



RANDOM SEQUENCES AND POINTWISE CONVERGENCE OF MULTIPLE ERGODIC AVERAGES 






2.1. Strategy of the proof. Roughly speaking, in order to prove Theorem 1.1 we go through
the following successive comparisons:

$\frac{1}{N}\sum_{n=1}^{N}f(T^nx)\cdot g(S^{a_n(\omega)}x)$

    $\approx\ \frac{1}{W_N}\sum_{n=1}^{N}X_n(\omega)\cdot f(T^{X_1(\omega)+\cdots+X_n(\omega)}x)\cdot g(S^nx)$

    $\approx\ \frac{1}{W_N}\sum_{n=1}^{N}\sigma_n\cdot f(T^{X_1(\omega)+\cdots+X_n(\omega)}x)\cdot g(S^nx)$

    $\approx\ \frac{1}{N}\sum_{n=1}^{N}f(T^{X_1(\omega)+\cdots+X_n(\omega)}x)\cdot g(S^nx)$

    $\approx\ \tilde g(x)\cdot\frac{1}{N}\sum_{n=1}^{N}f(T^{X_1(\omega)+\cdots+X_n(\omega)}x)$

    $\approx\ \tilde g(x)\cdot\frac{1}{N}\sum_{n=1}^{N}f(T^nx)$

    $\approx\ \tilde f(x)\cdot\tilde g(x),$

where $A_N(\omega,x)\approx B_N(\omega,x)$ means that almost surely (the set of probability 1 is universal), the
expression $A_N(\omega,x)$ is asymptotically equal to $B_N(\omega,x)$ for almost every $x\in X$. The second
comparison is the most crucial one; essentially one has to get good estimates for the $L^2$ norm of
the averages $\frac{1}{W_N}\sum_{n=1}^{N}(X_n(\omega)-\sigma_n)\cdot T^{X_1(\omega)+\cdots+X_n(\omega)}f\cdot S^ng$. We do this in two steps. First we
use an elementary estimate of van der Corput twice to get a bound that depends only on the
random variables $Y_n$, and then estimate the resulting expressions using the independence of the
variables $Y_n$. Let us also mention that the fifth comparison follows immediately by applying
the first three for $g=1$.

2.2. A reduction. Our first goal is to reduce Theorem 1.1 to proving the following result:

Proposition 2.1. Suppose that $\sigma_n=n^{-a}$ for some $a\in(0,1/14)$ and let $\gamma > 1$ be a real number.
Then almost surely the following holds: For every probability space $(X,\mathcal X,\mu)$, commuting
measure preserving transformations $T,S\colon X\to X$, and functions $f,g\in L^\infty(\mu)$, we have

(8)    $\sum_{k=1}^{\infty}\Big\|\frac{1}{W_{[\gamma^k]}}\sum_{n=1}^{[\gamma^k]}Y_n(\omega)\cdot T^{X_1(\omega)+\cdots+X_n(\omega)}f\cdot S^ng\Big\|_{L^2(\mu)}^2\ <\ +\infty.$

We are going to establish this reduction in the next subsections.

2.2.1. First step. We assume, as we may, that both functions $|f|$ and $|g|$ are pointwise bounded
by 1 for all points in $X$. By (5), for every $\omega\in\Omega$ and $x\in X$ we have

$\frac{1}{N}\sum_{n=1}^{N}f(T^nx)\cdot g(S^{a_n(\omega)}x)=\frac{1}{N}\sum_{n=1}^{N}f(T^{X_1(\omega)+\cdots+X_{a_n(\omega)}(\omega)}x)\cdot g(S^{a_n(\omega)}x).$
A moment of reflection shows that for every bounded sequence $(b_n)_{n\in\mathbb N}$, for every $\omega\in\Omega$, the
averages

$\frac{1}{N}\sum_{n=1}^{N}b_{a_n(\omega)}$

and the averages

$\frac{1}{W_N(\omega)}\sum_{n=1}^{N}X_n(\omega)\cdot b_n,$

where $W_N(\omega):=X_1(\omega)+\cdots+X_N(\omega)$, are asymptotically equal as $N\to\infty$. Moreover,
Lemma 5.6 in the Appendix gives that almost surely $\lim_{N\to\infty}W_N(\omega)/W_N=1$. Therefore, the
last averages are asymptotically equal to the averages

$\frac{1}{W_N}\sum_{n=1}^{N}X_n(\omega)\cdot b_n.$

Putting these observations together, we see that for almost every $\omega\in\Omega$ the averages in (3)
and the averages

(9)    $\frac{1}{W_N}\sum_{n=1}^{N}X_n(\omega)\cdot f(T^{X_1(\omega)+\cdots+X_n(\omega)}x)\cdot g(S^nx)$

are asymptotically equal for every $x\in X$.
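The "moment of reflection" rests on an exact reindexing: with $K=W_N(\omega)=X_1(\omega)+\cdots+X_N(\omega)$, the weighted average $\frac{1}{W_N(\omega)}\sum_{n\le N}X_n(\omega)b_n$ coincides with $\frac{1}{K}\sum_{k\le K}b_{a_k(\omega)}$, so the two families of averages run through the same values. The Python sketch below checks this on one simulated realisation; the function name, the seed, and the random bounded sequence standing in for $b_n$ are assumptions made for illustration.

```python
import random

def reindexing_check(a, limit, seed=1):
    """Compare (1/K) * sum_{k<=K} b_{a_k} with (1/W_N(omega)) * sum_{n<=N} X_n * b_n.

    Here N = limit and K = W_N(omega) is the number of selected integers up to N.
    """
    rng = random.Random(seed)
    xs = [1 if rng.random() < n ** (-a) else 0 for n in range(1, limit + 1)]
    b = [rng.uniform(-1.0, 1.0) for _ in range(limit)]  # b[n - 1] plays the role of b_n
    seq = [n for n, x in enumerate(xs, start=1) if x == 1]  # a_1 < a_2 < ... < a_K
    K = len(seq)  # K >= 1 because X_1 = 1 always (probability 1**(-a) = 1)
    along_sequence = sum(b[a_k - 1] for a_k in seq) / K
    weighted = sum(x * b_n for x, b_n in zip(xs, b)) / K
    return along_sequence, weighted

if __name__ == "__main__":
    lhs, rhs = reindexing_check(0.05, 5000)
    print(lhs, rhs)
```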

2.2.2. Second step. Next, we study the limiting behavior of the averages (9) when the random
variables $X_n$ are replaced by their mean. Namely, we study the averages

(10)    $\frac{1}{W_N}\sum_{n=1}^{N}\sigma_n\cdot f(T^{X_1(\omega)+\cdots+X_n(\omega)}x)\cdot g(S^nx).$

By Lemma 5.3 in the Appendix, for every $\omega\in\Omega$ and $x\in X$ they are asymptotically equal to
the averages

(11)    $\frac{1}{N}\sum_{n=1}^{N}f(T^{X_1(\omega)+\cdots+X_n(\omega)}x)\cdot g(S^nx).$

Lemma 2.2. Suppose that $\sigma_n=n^{-a}$ for some $a\in(0,1)$. Then almost surely the following
holds: For every probability space $(X,\mathcal X,\mu)$, measure preserving transformations $T,S\colon X\to X$,
and functions $f,g\in L^\infty(\mu)$, we have

$\lim_{N\to\infty}\Big(\frac{1}{N}\sum_{n=1}^{N}f(T^{X_1(\omega)+\cdots+X_n(\omega)}x)\cdot g(S^nx)-\frac{1}{N}\sum_{n=1}^{N}f(T^{X_1(\omega)+\cdots+X_n(\omega)}x)\cdot\mathbb E(g|\mathcal I(S))(x)\Big)=0$

for almost every $x\in X$.

Proof. It suffices to show that almost surely, if $\mathbb E(g|\mathcal I(S))=0$, then $\lim_{N\to\infty}A_N(f,g,\omega,x)=0$
for almost every $x\in X$, where

$A_N(f,g,\omega,x):=\frac{1}{N}\sum_{n=1}^{N}f(T^{X_1(\omega)+\cdots+X_n(\omega)}x)\cdot g(S^nx).$
n=l 
First we consider functions $g$ of the form $h-Sh$ where $h\in L^\infty(\mu)$. Assuming, as we may,
that both $|f|$ and $|h|$ are pointwise bounded by 1 for all points in $X$, partial summation gives
that

$A_N(f,h-Sh,\omega,x)=\frac{1}{N}\sum_{n=1}^{N}\big(f(T^{X_1(\omega)+\cdots+X_n(\omega)}x)-f(T^{X_1(\omega)+\cdots+X_{n-1}(\omega)}x)\big)\cdot h(S^nx)+o_N(1).$

The complex norm of the last expression is bounded by a constant times the average

$\frac{1}{N}\sum_{n=1}^{N}\mathbf 1_{E_n}(\omega),$

where $E_n:=\{\omega\colon X_n(\omega)=1\}$. Since $\mathbb P(E_n)=n^{-a}$, combining our assumption with Lemma 5.5
in the Appendix, we get that the last average converges almost surely to 0 as $N\to\infty$. Therefore,
on a set $\Omega_0$ of probability 1, that depends only on the random variables $X_n$, we have

(12)    $\lim_{N\to\infty}A_N(f,h-Sh,\omega,x)=0$

for almost every $x\in X$.

Furthermore, using the trivial estimate

$|A_N(f,g,\omega,x)|\le\frac{1}{N}\sum_{n=1}^{N}|g|(S^nx),$

and then applying the pointwise ergodic theorem for the transformation $S$, we get for every
$\omega\in\Omega$ that

(13)    $\int\limsup_{N\to\infty}|A_N(f,g,\omega,\cdot)|\,d\mu\le\|g\|_{L^1(\mu)}.$

Since every function $g\in L^\infty(\mu)$ that satisfies $\mathbb E(g|\mathcal I(S))=0$ can be approximated in $L^1(\mu)$
arbitrarily well by functions of the form $h-Sh$ with $h\in L^\infty(\mu)$, combining (12) and (13), we
get for every $\omega\in\Omega_0$, that if $\mathbb E(g|\mathcal I(S))=0$, then $\lim_{N\to\infty}A_N(f,g,\omega,x)=0$ for almost every
$x\in X$. This completes the proof. □

2.2.3. Third step. We next turn our attention to the study of the limiting behavior of the
averages

(14)    $\frac{1}{N}\sum_{n=1}^{N}f(T^{X_1(\omega)+\cdots+X_n(\omega)}x).$

Lemma 2.3. Let $\sigma_n=n^{-a}$ for some $a\in(0,1/14)$. Then almost surely the following holds: For
every probability space $(X,\mathcal X,\mu)$, measure preserving transformation $T\colon X\to X$, and function
$f\in L^\infty(\mu)$, the averages in (14) converge to $\mathbb E(f|\mathcal I(T))(x)$ for almost every $x\in X$.

Remark. Improving the range of the parameter $a$ would not lead to corresponding improvements
in our main results. On the other hand, the restricted range we used enables us to give a succinct
proof using Proposition 2.1.
Proof. We assume, as we may, that the function $|f|$ is pointwise bounded by 1 for all points in
$X$. First notice that by Lemma 5.3 in the Appendix, for every $\omega\in\Omega$ and $x\in X$, the averages
in (14) are asymptotically equal to the averages

$\frac{1}{W_N}\sum_{n=1}^{N}\sigma_n\cdot f(T^{X_1(\omega)+\cdots+X_n(\omega)}x),$

where $W_N:=\sum_{n=1}^{N}\sigma_n\sim N^{1-a}$. Combining this observation with Corollary 5.2 in the
Appendix, we deduce that it suffices to show that almost surely the following holds: For
every probability space $(X,\mathcal X,\mu)$, measure preserving transformation $T\colon X\to X$, function
$f\in L^\infty(\mu)$, and $\gamma\in\{1+1/k,\ k\in\mathbb N\}$, we have

(15)    $\lim_{N\to\infty}\frac{1}{W_{[\gamma^N]}}\sum_{n=1}^{[\gamma^N]}\sigma_n\cdot f(T^{X_1(\omega)+\cdots+X_n(\omega)}x)=\mathbb E(f|\mathcal I(T))(x)$

for almost every $x\in X$.

Using Proposition 2.1 for $g=1$, we get that almost surely (the set of probability 1 depends
only on the random variables $X_n$), for every $\gamma\in\{1+1/k,\ k\in\mathbb N\}$, the averages in (15) are
asymptotically equal to the averages

$\frac{1}{W_{[\gamma^N]}}\sum_{n=1}^{[\gamma^N]}X_n(\omega)\cdot f(T^{X_1(\omega)+\cdots+X_n(\omega)}x)$

for almost every $x\in X$. Hence, it suffices to study the limiting behavior of the averages

$\frac{1}{W_N}\sum_{n=1}^{N}X_n(\omega)\cdot f(T^{X_1(\omega)+\cdots+X_n(\omega)}x).$

Repeating the argument used in Section 2.2.1 (with $g=1$), we deduce that for every $\omega\in\Omega$
and $x\in X$, they are asymptotically equal to the averages

$\frac{1}{N}\sum_{n=1}^{N}f(T^{X_1(\omega)+\cdots+X_{a_n(\omega)}(\omega)}x)=\frac{1}{N}\sum_{n=1}^{N}f(T^nx)$

where the last equality follows from (5). Finally, using the pointwise ergodic theorem we get
that the last averages converge to $\mathbb E(f|\mathcal I(T))(x)$ for almost every $x\in X$. This completes the
proof. □

2.2.4. Last step. We prove Theorem 1.1 by combining Proposition 2.1 with the arguments
in the previous three steps. We start with Proposition 2.1. It gives that there exists a set
$\Omega_0\in\mathcal F$ of probability 1 such that for every $\omega\in\Omega_0$ the following holds: For every probability
space $(X,\mathcal X,\mu)$, commuting measure preserving transformations $T,S\colon X\to X$, functions $f,g\in
L^\infty(\mu)$, and $\gamma\in\{1+1/k,\ k\in\mathbb N\}$, we have

(16)    $\sum_{N=1}^{\infty}\big\|S_{[\gamma^N]}(\omega,\cdot)\big\|_{L^2(\mu)}^2\ <\ +\infty$
where

$S_N(\omega,x):=\frac{1}{W_N}\sum_{n=1}^{N}Y_n(\omega)\cdot f(T^{X_1(\omega)+\cdots+X_n(\omega)}x)\cdot g(S^nx).$

In the remaining argument $\omega$ is assumed to belong to the aforementioned set $\Omega_0$. Notice that
(16) implies that

$\lim_{N\to\infty}S_{[\gamma^N]}(\omega,x)=0$ for almost every $x\in X$.

We conclude that for almost every $x\in X$, for every $\gamma\in\{1+1/k,\ k\in\mathbb N\}$, the difference

$\frac{1}{W_{[\gamma^N]}}\sum_{n=1}^{[\gamma^N]}X_n(\omega)\cdot f(T^{X_1(\omega)+\cdots+X_n(\omega)}x)\cdot g(S^nx)-\frac{1}{W_{[\gamma^N]}}\sum_{n=1}^{[\gamma^N]}\sigma_n\cdot f(T^{X_1(\omega)+\cdots+X_n(\omega)}x)\cdot g(S^nx)$

converges to 0 as $N\to\infty$. In Sections 2.2.2 and 2.2.3 we proved that for almost every $x\in X$
we have

$\lim_{N\to\infty}\frac{1}{W_N}\sum_{n=1}^{N}\sigma_n\cdot f(T^{X_1(\omega)+\cdots+X_n(\omega)}x)\cdot g(S^nx)=\tilde f(x)\cdot\tilde g(x),$

where $\tilde f:=\mathbb E(f|\mathcal I(T))$, and $\tilde g:=\mathbb E(g|\mathcal I(S))$. We deduce from the above that for almost every
$x\in X$, for every $\gamma\in\{1+1/k,\ k\in\mathbb N\}$, we have that

$\lim_{N\to\infty}\frac{1}{W_{[\gamma^N]}}\sum_{n=1}^{[\gamma^N]}X_n(\omega)\cdot f(T^{X_1(\omega)+\cdots+X_n(\omega)}x)\cdot g(S^nx)=\tilde f(x)\cdot\tilde g(x).$

Since the sequence $(W_n)$ satisfies the assumptions of Corollary 5.2 in the Appendix, we conclude
that for non-negative functions $f,g\in L^\infty(\mu)$, for almost every $x\in X$, we have

(17)    $\lim_{N\to\infty}\frac{1}{W_N}\sum_{n=1}^{N}X_n(\omega)\cdot f(T^{X_1(\omega)+\cdots+X_n(\omega)}x)\cdot g(S^nx)=\tilde f(x)\cdot\tilde g(x).$

Splitting the real and imaginary part of the function $f$ as a difference of two non-negative
functions, doing the same for the function $g$, and using the linearity of the operator $f\mapsto\tilde f$, we
deduce that (17) holds for arbitrary $f,g\in L^\infty(\mu)$.

Lastly, combining the previous identity and the argument used in Section 2.2.1, we deduce
that for almost every $x\in X$ we have

$\lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N}f(T^nx)\cdot g(S^{a_n(\omega)}x)=\tilde f(x)\cdot\tilde g(x).$

We have therefore established:

Proposition 2.4. If Proposition 2.1 holds, then Theorem 1.1 holds.

In the next subsection we prove Proposition 2.1.
2.3. Proof of Proposition 2.1. The proof of Proposition 2.1 splits in two parts. First we
estimate the $L^2$ norm of the averages $\frac{1}{W_N}\sum_{n=1}^{N}Y_n\cdot T^{X_1+\cdots+X_n}f\cdot S^ng$ by an expression that is
independent of the transformations $T$, $S$ and the functions $f$, $g$. The main idea is to use van der
Corput's Lemma (see Lemma 5.4 in the Appendix) enough times to get the desired cancelation,
allowing enough flexibility on the parameters involved to ensure that certain terms become
negligible. Subsequently, using moment estimates, we show that the resulting expression is
almost surely summable along exponentially growing sequences of integers.

Before delving into the details we make some preparatory remarks that will help us ease our
notation. We assume that both functions $f$, $g$ are bounded by 1. We remind the reader that

$\sigma_n=n^{-a},\qquad W_N\sim N^{1-a}$

for some $a\in(0,1)$. We are going to use parameters $M$ and $R$ that satisfy

$M=[N^b],\qquad R=[N^c]$

for some $b,c\in(0,1)$ at our disposal. We impose more restrictions on $a,b,c$ as we move on.

2.3.1. Eliminating the dependence on the transformations and the functions. To simplify our
notation, in this subsection, when we write $\sum_{n=1}^{N^b}$ we mean $\sum_{n=1}^{[N^b]}$.

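Using Lemma 5.4 in the Appendix with $M=[N^b]$ and $v_n=Y_n\cdot T^{X_1+\cdots+X_n}f\cdot S^ng$, we get
that

(18)    $A_N:=\Big\|N^{-1+a}\sum_{n=1}^{N}Y_n\cdot T^{X_1+\cdots+X_n}f\cdot S^ng\Big\|_{L^2(\mu)}^2\ \ll\ A_{1,N}+A_{2,N},$

where

$A_{1,N}:=N^{-1+2a-b}\sum_{n=1}^{N}\big\|Y_n\cdot T^{X_1+\cdots+X_n}f\cdot S^ng\big\|_{L^2(\mu)}^2$

and

$A_{2,N}:=N^{-1+2a-b}\sum_{m=1}^{N^b}\Big|\sum_{n=1}^{N-m}\int Y_{n+m}\cdot Y_n\cdot T^{X_1+\cdots+X_{n+m}}f\cdot S^{n+m}g\cdot T^{X_1+\cdots+X_n}\bar f\cdot S^n\bar g\,d\mu\Big|.$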

We estimate $A_{1,N}$. Since $\mathbb E_\omega(Y_n^2)=\sigma_n-\sigma_n^2\sim n^{-a}$, Lemma 5.6 in the Appendix gives for
every $a\in(0,1)$ that $\sum_{n=1}^{N}Y_n^2\sim\sum_{n=1}^{N}\mathbb E_\omega(Y_n^2)\sim N^{1-a}$. Therefore, almost surely we have

$A_{1,N}\ \ll\ N^{-1+2a-b}\sum_{n=1}^{N}Y_n^2\ \ll_\omega\ N^{-1+2a-b}\cdot N^{1-a}=N^{a-b}.$

It follows that $A_{1,N}$ is bounded by a negative power of $N$ as long as

$b > a.$



Next, we estimate $A_{2,N}$. We compose with $S^{-n}$ and use the Cauchy-Schwarz inequality. We
get

$A_{2,N}\ \le\ N^{-1+2a-b}\sum_{m=1}^{N^b}\Big\|\sum_{n=1}^{N-m}Y_{n+m}\cdot Y_n\cdot S^{-n}T^{X_1+\cdots+X_{n+m}}f\cdot S^{-n}T^{X_1+\cdots+X_n}\bar f\Big\|_{L^2(\mu)}.$
n=l 
Furthermore, since

$N^{-1+2a-b}\sum_{m=1}^{N^b}m\ \ll\ N^{-1+2a+b},$

we get the estimate

$A_{2,N}\ \ll\ N^{-d_1}+N^{-1+2a-b}\sum_{m=1}^{N^b}\Big\|\sum_{n=1}^{N}Y_{n+m}\cdot Y_n\cdot S^{-n}T^{X_1+\cdots+X_{n+m}}f\cdot S^{-n}T^{X_1+\cdots+X_n}\bar f\Big\|_{L^2(\mu)}$

where $d_1:=1-2a-b$ is positive as long as

$2a+b < 1.$

Using the Cauchy-Schwarz inequality we get

$A_{2,N}^2\ \ll\ N^{-2d_1}+N^{-2+4a-b}\sum_{m=1}^{N^b}\Big\|\sum_{n=1}^{N}Y_{n+m}\cdot Y_n\cdot S^{-n}T^{X_1+\cdots+X_{n+m}}f\cdot S^{-n}T^{X_1+\cdots+X_n}\bar f\Big\|_{L^2(\mu)}^2.$

Next we use Lemma 5.4 in the Appendix with $R=[N^c]$ and the obvious choice of functions
$v_n$, in order to estimate the square of the $L^2$ norm above. We get the estimate

$A_{2,N}^2\ \ll\ N^{-2d_1}+A_{3,N}+A_{4,N},$

where $A_{3,N}$, $A_{4,N}$ can be computed as before. Using Lemma 5.7 in the Appendix, and the
estimate $\mathbb E_\omega(Y_n^2)\sim n^{-a}$, we deduce that almost surely, for every $a\in(0,1/6)$ we have

$A_{3,N}\ \ll\ N^{-1+4a-b-c}\sum_{m=1}^{N^b}\sum_{n=1}^{N}Y_{n+m}^2\cdot Y_n^2\ \ll_\omega\ N^{2a-c}=N^{-d_2}$

where $d_2 > 0$ as long as

$2a < c.$

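Composing with $T^{-(X_1+\cdots+X_n)}S^n$, using that $T$ and $S$ commute, and the Cauchy-Schwarz
inequality, we see that

$A_{4,N}\ \ll\ N^{-1+4a-b-c}\sum_{m=1}^{N^b}\sum_{r=1}^{N^c}\Big\|\sum_{n=1}^{N-r}Y_{n+m+r}\cdot Y_{n+r}\cdot Y_{n+m}\cdot Y_n\cdot T^{X_{n+1}+\cdots+X_{n+m+r}}S^{-r}f\cdot T^{X_{n+1}+\cdots+X_{n+m}}\bar f\cdot T^{X_{n+1}+\cdots+X_{n+r}}S^{-r}\bar f\Big\|_{L^2(\mu)}.$

Since for every $k\in\mathbb N$ we have $X_{n+1}+\cdots+X_{n+k}\in\{0,\ldots,k\}$, it follows that

(19)    $A_{4,N}\ \le\ A_{5,N}:=N^{-1+4a-b-c}\sum_{m=1}^{N^b}\sum_{r=1}^{N^c}\sum_{k_1=0}^{m+r}\sum_{k_2=0}^{r}\sum_{k_3=0}^{m}\Big|\sum_{n=1}^{N-r}Y_{n+m+r}\cdot Y_{n+r}\cdot Y_{n+m}\cdot Y_n\cdot\mathbf 1_{\sum_{k=1}^{m+r}X_{n+k}=k_1}(n)\cdot\mathbf 1_{\sum_{k=1}^{r}X_{n+k}=k_2}(n)\cdot\mathbf 1_{\sum_{k=1}^{m}X_{n+k}=k_3}(n)\Big|.$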

Summarizing, we have just shown that as long as

(20)    $a < b,\quad 2a+b < 1,\quad 2a < c,\quad a\in(0,1/6),\quad b,c\in(0,1),$

almost surely the following holds: For every probability space $(X,\mathcal X,\mu)$, commuting measure
preserving transformations $T,S\colon X\to X$, and functions $f,g\in L^\infty(\mu)$ with $\|f\|_{L^\infty(\mu)}\le 1$ and
$\|g\|_{L^\infty(\mu)}\le 1$, we have

(21)    $A_N\ \ll_\omega\ N^{-d_3}+A_{5,N}$

for some $d_3 > 0$, where $A_{5,N}$ is defined in (19). Notice that the expression $A_{5,N}$ depends only
on the random variables $X_n$. Therefore, in order to complete the proof of Proposition 2.1 it
suffices to show that almost surely $A_{5,N}$ is summable along exponentially growing sequences of
integers.

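2.3.2. Estimating $A_{5,N}$ (End of proof of Proposition 2.1). Assuming that

(22)    $b < c,$

we get that

(23)    $\mathbb E_\omega(A_{5,N})\ \ll\ N^{-1+4a-b-c}\sum_{m=1}^{N^b}\sum_{r=1}^{N^c}\sum_{k_1=0}^{2N^c}\sum_{k_2=0}^{N^c}\sum_{k_3=0}^{N^b}\mathbb E_\omega\Big|\sum_{n=1}^{N-r}Y_n\cdot Z_{n,m,r,k_1,k_2,k_3}\Big|$

where

$Z_{n,m,r,k_1,k_2,k_3}:=Y_{n+m+r}\cdot Y_{n+r}\cdot Y_{n+m}\cdot\mathbf 1_{\sum_{k=1}^{m+r}X_{n+k}=k_1}(n)\cdot\mathbf 1_{\sum_{k=1}^{r}X_{n+k}=k_2}(n)\cdot\mathbf 1_{\sum_{k=1}^{m}X_{n+k}=k_3}(n).$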
Using the Cauchy-Schwarz inequality we get

(24)    $\mathbb E_\omega\Big|\sum_{n=1}^{N-r}Y_n\cdot Z_{n,m,r,k_1,k_2,k_3}\Big|\ \le\ \Big(\mathbb E_\omega\Big|\sum_{n=1}^{N-r}Y_n\cdot Z_{n,m,r,k_1,k_2,k_3}\Big|^2\Big)^{1/2}.$

We expand the square in order to compute its expectation. It is equal to

$\sum_{1\le n_1,n_2\le N-r}\mathbb E_\omega\big(Y_{n_1}\cdot Z_{n_1,m,r,k_1,k_2,k_3}\cdot Y_{n_2}\cdot Z_{n_2,m,r,k_1,k_2,k_3}\big).$

Notice that if $n_1 < n_2$, then for every $m,r\in\mathbb N$, and non-negative integers $k_1,k_2,k_3$, the random
variable $Y_{n_1}$ is independent of the variables $Y_{n_2}$, $Z_{n_1,m,r,k_1,k_2,k_3}$, and $Z_{n_2,m,r,k_1,k_2,k_3}$. Since $Y_n$
has zero mean, it follows that if $n_1\neq n_2$, then

$\mathbb E_\omega\big(Y_{n_1}\cdot Z_{n_1,m,r,k_1,k_2,k_3}\cdot Y_{n_2}\cdot Z_{n_2,m,r,k_1,k_2,k_3}\big)=0.$

Therefore, the right hand side of equation (24) is equal to

$\Big(\sum_{n=1}^{N-r}\mathbb E_\omega(Y_n^2)\cdot\mathbb E_\omega(Z_{n,m,r,k_1,k_2,k_3}^2)\Big)^{1/2}\ \le\ \Big(\sum_{n=1}^{N-r}\mathbb E_\omega(Y_n^2)\cdot\mathbb E_\omega(Y_{n+m}^2\cdot Y_{n+r}^2\cdot Y_{n+m+r}^2)\Big)^{1/2}.$

If $r,m,n$ are fixed and $r\neq m$, then the variables $Y_{n+m}^2$, $Y_{n+r}^2$, $Y_{n+m+r}^2$ are independent, and as
a consequence the right hand side is almost surely bounded by

$\Big(\sum_{n=1}^{N}n^{-4a}\Big)^{1/2}\ \ll\ N^{1/2-2a}.$

On the other hand, if $r,m,n$ are fixed and $r=m$, then the random variables $Y_{n+r}^2$, $Y_{n+2r}^2$ are
independent, and as a consequence the right hand side is almost surely bounded by

$\Big(\sum_{n=1}^{N}n^{-3a}\Big)^{1/2}\ \ll\ N^{1/2-3a/2}.$
Combining these two estimates with (23), we deduce that

$\mathbb{E}_\omega(A_{5,N}) \ll N^{-1+4a-b-c} \big( N^{1/2-2a+2b+3c} + N^{1/2-3a/2+2b+2c} \big) = N^{-1/2+2a+b+2c} + N^{-1/2+5a/2+b+c}$.

For fixed $\varepsilon > 0$, letting $a \in (0, 1/6)$, $b$ be greater and very close to $a$, and $c$ be greater and very close to $2a$, we get that the conditions (20) and (22) are satisfied, and

(25) $\mathbb{E}_\omega(A_{5,N}) \ll N^{(-1+14a)/2+\varepsilon} + N^{(-1+11a)/2+\varepsilon} \ll N^{-d_4}$

for some $d_4$ that satisfies

(26) $d_4 \ge (1-14a)/2 - \varepsilon$.

Therefore, for every $a \in (0, 1/14)$, if $\varepsilon$ is small enough, then the estimates (21) and (25) hold for some $d_3, d_4 > 0$.
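
The exponent bookkeeping in this step can be checked mechanically. The following is a minimal sketch, assuming the limiting choice $b = a$ and $c = 2a$ (the text takes $b$ and $c$ slightly larger, which is what the $\varepsilon$ accounts for). The helper name `exponents` is ours and does not come from the article.

```python
from fractions import Fraction

def exponents(a, b, c):
    """Exponents of N in the two terms that bound E(A_{5,N}).

    The prefactor N^(-1+4a-b-c) is multiplied by N^(1/2-2a+2b+3c)
    (case r != m) and by N^(1/2-3a/2+2b+2c) (case r = m).
    """
    pre = -1 + 4 * a - b - c
    first = pre + Fraction(1, 2) - 2 * a + 2 * b + 3 * c
    second = pre + Fraction(1, 2) - Fraction(3, 2) * a + 2 * b + 2 * c
    return first, second

a = Fraction(1, 20)
first, second = exponents(a, b=a, c=2 * a)
# In the limiting case the exponents are (-1+14a)/2 and (-1+11a)/2.
assert first == (-1 + 14 * a) / 2
assert second == (-1 + 11 * a) / 2
print(first, second)
```

Both exponents are negative exactly when $a < 1/14$, which is where the restriction on $a$ comes from.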

Equation (25) gives that for every $\gamma > 1$ we have

$\sum_{N=1}^{\infty} \mathbb{E}_\omega\big(A_{5,[\gamma^N]}\big) < +\infty$.

As a consequence, for every $\gamma > 1$ we have almost surely that

(27) $\sum_{N=1}^{\infty} A_{5,[\gamma^N]}(\omega) < +\infty$.

Recalling the definition of $A_N$ in (18), and combining (21) and (27), we get that for every $a \in (0, 1/14)$ and $\gamma > 1$, almost surely the following holds: For every probability space $(X, \mathcal{X}, \mu)$, commuting measure preserving transformations $T, S\colon X \to X$, and $f, g \in L^\infty(\mu)$, we have



Yl ||V]( w >-) 

JV=1 

where 



2 

< +OO 

L2( M ) 



71=1 



This finishes the proof of Proposition 2.1.



3. Convergence for the same random iterates 

In this section we prove Theorem 1.4. Throughout, we use the notation introduced in Section 1.2.1 and the beginning of Section 2.2.
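3.1. Strategy of the proof. In order to prove Theorem 1.4 we go through the following successive comparisons:

$\frac{1}{N} \sum_{n=1}^{N} f(T^{a_n(\omega)} x) \cdot g(S^{a_n(\omega)} x) \approx \frac{1}{W_N} \sum_{n=1}^{N} X_n(\omega) \cdot f(T^n x) \cdot g(S^n x)$

$\approx \frac{1}{W_N} \sum_{n=1}^{N} \sigma_n \cdot f(T^n x) \cdot g(S^n x)$

$\approx \frac{1}{N} \sum_{n=1}^{N} f(T^n x) \cdot g(S^n x)$,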

where our notation was explained in Section 2.1. The key comparison is the second. One needs to get good estimates for the $L^2$ norm of the averages $\frac{1}{W_N} \sum_{n=1}^{N} Y_n(\omega) \cdot T^n f \cdot S^n g$, where $Y_n := X_n - \sigma_n$. We do this in two steps. First we use van der Corput's estimate and Herglotz's theorem to get a bound that depends only on the random variables $Y_n$. The resulting expressions turn out to be random trigonometric polynomials that can be estimated using classical techniques.$^5$

3.2. A reduction. Arguing as in Section 2.2 (in fact the argument is much simpler in the current case) we reduce Theorem 1.4 to proving the following result:

Proposition 3.1. Suppose that $\sigma_n = n^{-a}$ for some $a \in (0, 1/2)$ and let $\gamma > 1$ be a real number. Then almost surely the following holds: For every probability space $(X, \mathcal{X}, \mu)$, commuting measure preserving transformations $T, S\colon X \to X$, and functions $f, g \in L^\infty(\mu)$, we have

(28) $\sum_{k=1}^{\infty} \Big\| \frac{1}{W_{[\gamma^k]}} \sum_{n=1}^{[\gamma^k]} Y_n(\omega) \cdot T^n f \cdot S^n g \Big\|_{L^2(\mu)}^2 < +\infty$

where $W_N := \sum_{n=1}^{N} \sigma_n$.

We prove this result in the next subsection.

3.3. Proof of Proposition 3.1. As was the case with the proof of Proposition 2.1, the proof of Proposition 3.1 splits in two parts.

3.3.1. Eliminating the dependence on the transformations and the functions. We assume that both functions $f, g$ are bounded by 1. We start by using Lemma 5.4 for $M = N$ and $v_n := Y_n \cdot T^n f \cdot S^n g$ (this is essentially the ordinary expansion of the square of the sum). We get that

(29) $A_N := \Big\| N^{-1+a} \sum_{n=1}^{N} Y_n \cdot T^n f \cdot S^n g \Big\|_{L^2(\mu)}^2 \ll A_{1,N} + A_{2,N}$

where

$A_{1,N} := N^{-2+2a} \sum_{n=1}^{N} \| Y_n \cdot T^n f \cdot S^n g \|_{L^2(\mu)}^2$

5 A faster way to get such an estimate is to apply van der Corput's Lemma twice. The drawback of this method is that the resulting expression converges to zero only when $\sigma_n = n^{-a}$ for some $a \in (0, 1/4)$.

and



$A_{2,N} := N^{-2+2a} \sum_{m=1}^{N} \Big| \sum_{n=1}^{N-m} \int Y_{n+m} \cdot Y_n \cdot T^{n+m} f \cdot S^{n+m} g \cdot T^n f \cdot S^n g \, d\mu \Big|$.



We estimate $A_{1,N}$. Since $\mathbb{E}_\omega(Y_n^2) \sim n^{-a}$, Lemma 5.6 gives $\sum_{n=1}^{N} Y_n^2 \ll_\omega N^{1-a}$. It follows that almost surely we have

(30) $A_{1,N} \ll N^{-2+2a} \sum_{n=1}^{N} Y_n^2 \ll_\omega N^{-2+2a} \cdot N^{1-a} = N^{a-1}$.

Therefore, $A_{1,N}$ is bounded by a negative power of $N$ for every $a \in (0, 1)$.

We estimate $A_{2,N}$. Composing with $S^{-n}$ and using the Cauchy-Schwarz inequality we get

$A_{2,N} \le N^{-2+2a} \sum_{m=1}^{N} \Big\| \sum_{n=1}^{N-m} Y_{n+m} \cdot Y_n \cdot S^{-n} T^{n+m} f \cdot S^{-n} T^n f \Big\|_{L^2(\mu)}$.

Using that $T$ and $S$ commute and letting $R = TS^{-1}$ and $f_m = T^m f \cdot f$, we rewrite the previous estimate as

$A_{2,N} \le N^{-2+2a} \sum_{m=1}^{N} \Big\| \sum_{n=1}^{N-m} Y_{n+m} \cdot Y_n \cdot R^n f_m \Big\|_{L^2(\mu)}$.

Using Herglotz's theorem on positive definite sequences, and the fact that the functions $f_m$ are bounded by 1, we get that the right hand side is bounded by a constant multiple of

$A_{3,N} := N^{-1+2a} \cdot \max_{1 \le m \le N} \max_{t \in [0,1]} \Big| \sum_{n=1}^{N-m} Y_{n+m} \cdot Y_n \cdot e(nt) \Big|$.

Summarizing, we have shown that

(31) $A_N \ll_\omega N^{a-1} + A_{3,N}$.

Therefore, in order to prove Proposition 3.1 it remains to show that almost surely $A_{3,N} \ll_\omega N^{-d}$ for some $d > 0$. We do this in the next subsection.

3.3.2. Estimating $A_{3,N}$ (End of proof of Proposition 3.1). The goal of this section is to prove the following result:

Proposition 3.2. Suppose that $\sigma_n \sim n^{-a}$ for some $a \in (0, 1/2)$. Then almost surely we have

$\max_{1 \le m \le N} \max_{t \in [0,1]} \Big| \sum_{n=1}^{N-m} Y_{n+m} \cdot Y_n \cdot e(nt) \Big| \ll_\omega N^{1/2-a} \sqrt{\log N}$.

Notice that by combining this estimate with (31) we get a proof of Proposition 3.1, and as a consequence a proof of Theorem 1.4.
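
For the reader's convenience, the arithmetic behind this remark can be written out as follows. This is only a sketch; it uses nothing beyond the definition of $A_{3,N}$ and the bound of Proposition 3.2.

```latex
A_{3,N}
  = N^{-1+2a}\cdot \max_{1\le m\le N}\,\max_{t\in[0,1]}
    \Big|\sum_{n=1}^{N-m} Y_{n+m}\, Y_n\, e(nt)\Big|
  \ll_\omega N^{-1+2a}\cdot N^{1/2-a}\sqrt{\log N}
  = N^{-(1/2-a)}\sqrt{\log N}.
```

Since $a < 1/2$, the right hand side is bounded by $N^{-d}$ for every $0 < d < 1/2 - a$, which is the decay required after (31).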

The key ingredient in the proof of Proposition 3.2 is the following lemma. It is a strengthening of an estimate of Bourgain [7] regarding random trigonometric polynomials. The proof of the lemma is a variation on the classical Chernoff inequality (see e.g. Theorem 1.8 in [28]), combined with an elementary estimate on the uniform norm of a trigonometric polynomial. We were motivated to use this argument, over the one given in [7], after reading a paper of Fan and Schneider (in particular, the proof of Theorem 6.4 in [14]).



Lemma 3.3. Let $(Z_{m,n})_{m,n \in \mathbb{N}}$ be a family of random variables, uniformly bounded by 1, and with mean zero. Suppose that, for each fixed $m$, the random variables $Z_{m,n}$, $n \ge 1$, are independent. Let $(\rho_n)$ be a sequence of positive numbers such that

$\sup_{m \in \mathbb{N}} \big( \mathrm{Var}(Z_{m,n}) \big) \le \rho_n$ and $\lim_{N \to \infty} \frac{1}{\log N} \sum_{n=1}^{N} \rho_n = +\infty$.

Then, almost surely, we have

$\max_{1 \le m \le N} \max_{t \in [0,1]} \Big| \sum_{n=1}^{N} Z_{m,n} \cdot e(nt) \Big| \ll_\omega \Big( \log N \cdot \sum_{n=1}^{N} \rho_n \Big)^{1/2}$.

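Proof. It suffices to get the announced estimate for

$M_N := \max_{1 \le m \le N} \max_{t \in [0,1]} |P_{m,N}(t)|$

where

$P_{m,N}(t) := \sum_{n=1}^{N} Z_{m,n} \cdot \cos(2\pi n t)$.

In a similar way we get an estimate with $\sin(2\pi n t)$ in place of $\cos(2\pi n t)$.

Since $|Z_{m,n}| \le 1$ and $\mathbb{E}_\omega(Z_{m,n}) = 0$, we have $\mathbb{E}_\omega(e^{\lambda Z_{m,n}}) \le e^{\lambda^2 \mathrm{Var}(Z_{m,n})}$ for all $\lambda \in [-1, 1]$. (See Lemma 1.7 in [28].)

For every $m \in \mathbb{N}$, $\lambda \in [-1, 1]$, and $t \in [0, 1]$, we get that

(32) $\mathbb{E}_\omega\big(e^{\lambda P_{m,N}(t)}\big) = \prod_{n=1}^{N} \mathbb{E}_\omega\big(e^{\lambda Z_{m,n} \cos(2\pi n t)}\big) \le \prod_{n=1}^{N} e^{(\lambda \cos(2\pi n t))^2 \mathrm{Var}(Z_{m,n})} \le e^{\lambda^2 R_N}$

where

$R_N := \sum_{n=1}^{N} \rho_n$.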
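
The exponential moment bound quoted from [28] is easy to test numerically in the case relevant to this article. The sketch below assumes $Z = X - \sigma$ with $X$ a Bernoulli variable of mean $\sigma$, so that $|Z| \le 1$, $\mathbb{E}(Z) = 0$ and $\mathrm{Var}(Z) = \sigma(1-\sigma)$; the function names are ours. It is an illustration on a finite grid, not a proof.

```python
import math

def mgf_centered_bernoulli(lam, sigma):
    """E[exp(lam * Z)] for Z = X - sigma, where X ~ Bernoulli(sigma)."""
    return sigma * math.exp(lam * (1 - sigma)) + (1 - sigma) * math.exp(-lam * sigma)

def variance_bound(lam, sigma):
    """exp(lam^2 * Var(Z)) with Var(Z) = sigma * (1 - sigma)."""
    return math.exp(lam ** 2 * sigma * (1 - sigma))

# Scan a grid of lam in [-1, 1] and sigma in [0, 1] and record the
# largest value of (moment generating function) - (claimed bound).
worst = max(
    mgf_centered_bernoulli(l / 50, s / 50) - variance_bound(l / 50, s / 50)
    for l in range(-50, 51)
    for s in range(0, 51)
)
assert worst <= 1e-12
```

The bound is attained with equality only in the degenerate cases $\lambda = 0$ or $\sigma \in \{0, 1\}$, which is why the largest difference on the grid is zero up to rounding.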

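Next notice that for $\lambda \in [0, 1]$ we have

(33) $\mathbb{E}_\omega\big(e^{\lambda M_N}\big) = \mathbb{E}_\omega\Big( \max_{1 \le m \le N} e^{\lambda \max_t |P_{m,N}(t)|} \Big) \le \mathbb{E}_\omega\Big( \sum_{m=1}^{N} e^{\lambda \max_t |P_{m,N}(t)|} \Big) \le N \max_{1 \le m \le N} \mathbb{E}_\omega\big(e^{\lambda M_{m,N}}\big)$

where

$M_{m,N} := \max_{t \in [0,1]} |P_{m,N}(t)|$.

It is easy to see (e.g. Proposition 5 in Chapter 5, Section 2 of [IH]) that there exist random intervals $I_{m,N}$ of length $|I_{m,N}| \ge N^{-2}$ such that $|P_{m,N}(t)| \ge M_{m,N}/2$ for every $t \in I_{m,N}$. Using this, we get that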



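$\mathbb{E}_\omega\big(e^{\lambda_N M_{m,N}/2}\big) \le N^2 \cdot \mathbb{E}_\omega\Big( \int_{I_{m,N}} \big( e^{\lambda_N P_{m,N}(t)} + e^{-\lambda_N P_{m,N}(t)} \big) \, dt \Big) \le N^2 \cdot \mathbb{E}_\omega\Big( \int_{[0,1]} \big( e^{\lambda_N P_{m,N}(t)} + e^{-\lambda_N P_{m,N}(t)} \big) \, dt \Big)$,

where $\lambda_N \in [0, 1]$ are numbers at our disposal. Using (32) we get that

$\mathbb{E}_\omega\Big( \int_{[0,1]} \big( e^{\lambda_N P_{m,N}(t)} + e^{-\lambda_N P_{m,N}(t)} \big) \, dt \Big) = \int_{[0,1]} \mathbb{E}_\omega\big( e^{\lambda_N P_{m,N}(t)} + e^{-\lambda_N P_{m,N}(t)} \big) \, dt \le 2 e^{R_N \lambda_N^2}$.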

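Therefore,

$\mathbb{E}_\omega\big(e^{\lambda_N M_{m,N}/2}\big) \le 2N^2 \cdot e^{R_N \lambda_N^2}$.

Combining this estimate with (33), we get

$\mathbb{E}_\omega\big(e^{\lambda_N M_N/2}\big) \ll N^3 \cdot e^{R_N \lambda_N^2}$.

Therefore, there exists a universal constant $C$ such that

$\mathbb{E}_\omega\big(e^{\lambda_N (M_N/2 - R_N \lambda_N - \lambda_N^{-1} \log(CN^5))}\big) \le \frac{1}{N^2}$.

As a consequence,

(34) $\mathbb{P}\big(M_N \ge 2R_N \lambda_N + 2\log(CN^5) \lambda_N^{-1}\big) \le \frac{1}{N^2}$.

For $\alpha, \beta$ positive, the function $f(\lambda) = \alpha\lambda + \beta\lambda^{-1}$ achieves a minimum for $\lambda = \sqrt{\beta/\alpha}$. So letting $\lambda_N = \sqrt{\log(CN^5)/R_N}$ (by assumption $\lambda_N$ converges to 0, so $\lambda_N \le 1$ for large $N$) in (34) gives

$\mathbb{P}\big(M_N \ge 4\sqrt{R_N \log(CN^5)}\big) \le \frac{1}{N^2}$.

By the Borel-Cantelli Lemma, we get almost surely that

$M_N \ll_\omega \sqrt{R_N \log N}$.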

This completes the proof. □ 
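
The optimization step at the end of this proof can be made explicit. The following is a short sketch, assuming (34) is read with $\alpha = 2R_N$ and $\beta = 2\log(CN^5)$; the constants are only indicative, since any fixed multiple is absorbed by the $\ll_\omega$ notation.

```latex
f(\lambda) = \alpha\lambda + \beta\lambda^{-1}, \qquad
f'(\lambda) = \alpha - \beta\lambda^{-2} = 0
\iff \lambda = \sqrt{\beta/\alpha}, \qquad
f\big(\sqrt{\beta/\alpha}\big) = 2\sqrt{\alpha\beta}.

% With \alpha = 2R_N and \beta = 2\log(CN^5):
2\sqrt{\alpha\beta} = 4\sqrt{R_N\,\log(CN^5)} \ll \sqrt{R_N \log N}.
```

Because $\sum_{N} N^{-2} < +\infty$, the Borel-Cantelli Lemma shows that almost surely the event in (34) occurs for only finitely many $N$, which gives the stated bound on $M_N$.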
Finally we use Lemma 3.3 to prove Proposition 3.2.

Proof of Proposition 3.2. Our goal is to apply Lemma 3.3 for the random variables $Y_{n+m} \cdot Y_n$ where $Y_n = X_n - \sigma_n$. These random variables are bounded by 1 and have zero mean. We just have to take some care because they are not independent. We divide the positive integers into two classes:

$\Lambda_{1,m} := \{n\colon 2km < n \le (2k+1)m$ for some non-negative integer $k\}$

and

$\Lambda_{2,m} := \{n\colon (2k+1)m < n \le (2k+2)m$ for some non-negative integer $k\}$.

Then for fixed $m \in \mathbb{N}$, the random variables $Y_{n+m} \cdot Y_n$, $n \in \Lambda_{1,m}$, are independent, and the same holds for the random variables $Y_{n+m} \cdot Y_n$, $n \in \Lambda_{2,m}$. For $i = 1, 2$, we apply Lemma 3.3 to the random variables

$Z_{m,n} := Y_{n+m} \cdot Y_n \cdot \mathbf{1}_{\Lambda_{i,m} \cap [1, N-m]}(n)$.

Notice that either $\mathrm{Var}(Z_{m,n}) = 0$, or

$\mathrm{Var}(Z_{m,n}) = \sigma_{n+m}\sigma_n - \sigma_{n+m}^2\sigma_n - \sigma_{n+m}\sigma_n^2 + \sigma_{n+m}^2\sigma_n^2 \le \sigma_{n+m}\sigma_n \le \sigma_n^2 \sim n^{-2a}$.
Since $\sum_{n=1}^{N} n^{-2a} \sim N^{1-2a}$ and $a < 1/2$, the assumptions of Lemma 3.3 are satisfied for $\rho_n = n^{-2a}$. We deduce that almost surely we have

$\max_{1 \le m \le N} \max_{t \in [0,1]} \Big| \sum_{n=1}^{N-m} Y_{n+m} \cdot Y_n \cdot e(nt) \Big| \ll_\omega N^{1/2-a} \sqrt{\log N}$.

This completes the proof. □ 
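
The size of these random trigonometric sums can be explored numerically. The following is a minimal simulation sketch, assuming $\sigma_n = n^{-a}$ and independent Bernoulli variables $X_n$; the function name, the grid size and the seed are illustrative choices of ours. The maximum is taken over a finite grid of $t$, so it only approximates the true supremum from below, and a single sample cannot confirm an almost sure statement.

```python
import cmath
import math
import random

def sup_correlation_sum(N, m, a, grid=512, seed=0):
    """Approximate max over t of |sum_{n=1}^{N-m} Y_{n+m} Y_n e(nt)|.

    Y_n = X_n - sigma_n with independent X_n ~ Bernoulli(sigma_n) and
    sigma_n = n^(-a). The maximum runs over t = j / grid, j = 0..grid-1.
    """
    rng = random.Random(seed)
    sigma = [0.0] + [n ** (-a) for n in range(1, N + 1)]
    Y = [0.0] + [(1.0 if rng.random() < sigma[n] else 0.0) - sigma[n]
                 for n in range(1, N + 1)]
    coeffs = [Y[n + m] * Y[n] for n in range(1, N - m + 1)]
    best = 0.0
    for j in range(grid):
        step = cmath.exp(2j * math.pi * j / grid)  # e(t) for t = j / grid
        z = step                                   # e(nt), starting at n = 1
        total = 0j
        for c in coeffs:
            total += c * z
            z *= step
        best = max(best, abs(total))
    return best

N, a = 400, 0.25
scale = N ** (0.5 - a) * math.sqrt(math.log(N))
for m in (1, 7, 50):
    value = sup_correlation_sum(N, m, a)
    print(m, round(value / scale, 3))
```

The printed ratios compare the observed maximum with the normalization $N^{1/2-a}\sqrt{\log N}$ of Proposition 3.2; the trivial bound for the same sum is $N - m$.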

4. Non-recurrence and non-convergence

In this section we prove Theorem 1.7. The proof is based on the following lemma:

Lemma 4.1. Let $a, b\colon \mathbb{N} \to \mathbb{Z} \setminus \{0\}$ be injective sequences and $F$ be any subset of $\mathbb{N}$. Then there exist a probability space $(X, \mathcal{X}, \mu)$, measure preserving transformations $T, S\colon X \to X$, both of them Bernoulli, and $A \in \mathcal{X}$, such that

$\mu\big(T^{-a(n)}A \cap S^{-b(n)}A\big) = \begin{cases} 0 & \text{if } n \in F, \\ 1/4 & \text{if } n \notin F. \end{cases}$



Proof. We are going to combine a construction of Berend (Ex 7.1 in [2]) with a construction of 
Furstenberg (page 40 in [16]). 

Suppose first that the range of both sequences misses infinitely many integers. Let $X = \{0, 1\}^{\mathbb{Z}}$, $\mu$ be the $(1/2, 1/2)$ Bernoulli measure on $X$, and $T$ be the shift transformation. Given a permutation $\pi$ of $\mathbb{Z}$ with $\pi(0) = 0$ we define the measure preserving transformation $\psi_\pi\colon X \to X$ by

$(\psi_\pi x)(n) = \begin{cases} x(0) & \text{if } n = 0, \\ 1 - x(\pi(n)) & \text{if } n \neq 0. \end{cases}$

Let $S = \psi_\pi^{-1} T \psi_\pi$ ($S$ is also Bernoulli). Since $(\psi_\pi)^{-1} = \psi_{\pi^{-1}}$ and $(\psi_{\pi^{-1}} x)(0) = x(0)$, for $n \in \mathbb{N}$ we have

$(S^n x)(0) = (\psi_{\pi^{-1}} T^n \psi_\pi x)(0) = (T^n \psi_\pi x)(0) = (\psi_\pi x)(n) = 1 - x(\pi(n))$.

Hence, if $A = \{x \in X\colon x(0) = 1\}$ we have

$T^{-a(n)}A \cap S^{-b(n)}A = \{x \in X\colon x(a(n)) = 1, \ x(\pi(b(n))) = 0\}$.

Finally, we make an appropriate choice for $\pi$. Since the sequences $(a(n))$ and $(b(n))$ are injective and miss infinitely many integers, there exists a permutation $\pi$ of the integers that fixes 0 and satisfies $\pi(b(n)) = a(n)$ if $n \in F$, and $\pi(b(n)) \neq a(n)$ if $n \notin F$. Then

$\mu\big(T^{-a(n)}A \cap S^{-b(n)}A\big) = \begin{cases} 0 & \text{if } n \in F, \\ 1/4 & \text{if } n \notin F. \end{cases}$

We now consider the general case. Notice that the range of the sequences $(2a(n))$ and $(2b(n))$ misses infinitely many values. We consider the transformations $T^2$ and $S^2$ in place of $T$ and $S$ (again they are Bernoulli) and carry out the previous argument with a permutation $\pi$ that satisfies $\pi(2b(n)) = 2a(n)$ if $n \in F$ and $\pi(2b(n)) \neq 2a(n)$ if $n \notin F$. □
Corollary 4.2. Let $a, b\colon \mathbb{N} \to \mathbb{Z} \setminus \{0\}$ be injective sequences, and $c\colon \mathbb{N} \to [0, 1/4]$ be any sequence. Then there exist a probability space $(X, \mathcal{X}, \mu)$, measure preserving transformations $T, S\colon X \to X$, and $A \in \mathcal{X}$, such that for every $n \in \mathbb{N}$ one has

$c(n) = \mu\big(T^{-a(n)}A \cap S^{-b(n)}A\big)$.

Proof. The set $\mathcal{S}$ consisting of all sequences that take values on a set $[0, \alpha]$, where $\alpha > 0$, is a compact (with the topology of pointwise convergence) convex subset of the locally convex space that consists of all bounded sequences. The extreme points of $\mathcal{S}$ are the sequences that take values in the set $\{0, \alpha\}$. The set $\mathrm{ext}(\mathcal{S})$, of extreme points of $\mathcal{S}$, is closed, hence, by the theorem of Krein-Milman, every element in $\mathcal{S}$ is the barycenter of a Borel probability measure on $\mathrm{ext}(\mathcal{S})$. As a consequence we get that given any sequence $c\colon \mathbb{N} \to [0, 1/4]$, there exists a Borel probability measure $\sigma$ on a compact metric space $(Y, d)$, and sequences $c_y\colon \mathbb{N} \to \{0, 1/4\}$, $y \in Y$, such that for every $n \in \mathbb{N}$ one has

$c(n) = \int c_y(n) \, d\sigma(y)$.

Looking at the proof of Lemma 4.1 we see that there exist measure preserving transformations $T_y$ and $S_y$, acting on the same probability space $(X, \mathcal{X}, \mu)$, and $A \in \mathcal{X}$, such that for every $y \in Y$ and $n \in \mathbb{N}$ one has

$c_y(n) = \mu\big(T_y^{-a(n)}A \cap S_y^{-b(n)}A\big)$.

On the space $(Y \times X, \mathcal{B}_Y \times \mathcal{X}, \sigma \times \mu)$ we define the measure preserving transformations $T, S\colon Y \times X \to Y \times X$ by the formula $T(y, x) = (y, T_y(x))$ and $S(y, x) = (y, S_y(x))$. Then for every $n \in \mathbb{N}$ one has

$(\sigma \times \mu)\big(T^{-a(n)}(Y \times A) \cap S^{-b(n)}(Y \times A)\big) = \int \mu\big(T_y^{-a(n)}A \cap S_y^{-b(n)}A\big) \, d\sigma(y) = \int c_y(n) \, d\sigma(y) = c(n)$.

□ 

Proof of Theorem 1.7. For non-convergence take $F = \bigcup_{n \in \mathbb{N}} [2^{2n}, 2^{2n+1}]$ in Lemma 4.1 and define $f = g = \mathbf{1}_A$. For non-recurrence take $F = \mathbb{N}$ in Lemma 4.1. □



5. Appendix 

We prove some results that were used in the main part of the article. 

5.1. Lacunary subsequence trick. We are going to give a variation of a trick that is often 
used to prove convergence results for averages (see [26] for several such instances). 

Lemma 5.1. Let $(a_n)_{n \in \mathbb{N}}$ be a sequence of non-negative real numbers and $(W_n)_{n \in \mathbb{N}}$ be an increasing sequence of positive real numbers that satisfies

$\lim_{\gamma \to 1^+} \limsup_{n \to \infty} \frac{W_{[\gamma n]}}{W_n} = 1$.

For $N \in \mathbb{N}$ let

$A_N := \frac{1}{W_N} \sum_{n=1}^{N} a_n$.

Suppose that there exists $L \in [0, +\infty]$, and a sequence of real numbers $\gamma_k \in (1, +\infty)$, with $\gamma_k \to 1$, and such that for every $k \in \mathbb{N}$ we have

$\lim_{N \to \infty} A_{[\gamma_k^N]} = L$.

Then

$\lim_{N \to \infty} A_N = L$.

Proof. Fix $k \in \mathbb{N}$ and for $N \in \mathbb{N}$ let $M = M(k, N)$ be a non-negative integer such that

$[\gamma_k^M] \le N < [\gamma_k^{M+1}]$.

Since $a_n \ge 0$ for every $n \in \mathbb{N}$ and $W_n$ is increasing, we have

$A_N = \frac{1}{W_N} \sum_{n=1}^{N} a_n \le \frac{1}{W_{[\gamma_k^M]}} \sum_{n=1}^{[\gamma_k^{M+1}]} a_n = c_{k,M} \, A_{[\gamma_k^{M+1}]}$, where $c_{k,M} := W_{[\gamma_k^{M+1}]} / W_{[\gamma_k^M]}$.

Similarly we have

$A_N \ge c_{k,M}^{-1} \, A_{[\gamma_k^M]}$.

Putting the previous estimates together we get

(35) $c_{k,M}^{-1} \, A_{[\gamma_k^M]} \le A_N \le c_{k,M} \, A_{[\gamma_k^{M+1}]}$.

Notice that our assumptions give that

(36) $\lim_{k \to \infty} \limsup_{M \to \infty} c_{k,M} = 1$.

Since $M = M(k, N) \to \infty$ as $N \to \infty$ and $k$ is fixed, letting $N \to \infty$ and then $k \to \infty$ in (35), and combining equation (36) with our assumption $\lim_{N \to \infty} A_{[\gamma_k^N]} = L$, we deduce that

$\liminf_{N \to \infty} A_N = \limsup_{N \to \infty} A_N = L$.

This completes the proof. □
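
To see what condition (36) means in practice, one can compute the ratios $c_{k,M}$ for a concrete weight sequence. The sketch below assumes the model weights $W_n = n^{1-a}$ (the article only uses $W_N \sim N^{1-a}$), and the function name `ratio` is ours. For large $M$ the ratio is close to $\gamma^{1-a}$, which tends to 1 as $\gamma \to 1$.

```python
import math

def ratio(gamma, M, a):
    """c_{k,M} = W_[gamma^(M+1)] / W_[gamma^M] for the model weights W_n = n^(1-a)."""
    def W(n):
        return n ** (1 - a)
    return W(math.floor(gamma ** (M + 1))) / W(math.floor(gamma ** M))

a = 0.3
for gamma in (2.0, 1.5, 1.1, 1.01):
    # Choose M so that gamma^M is about 10^9; rounding effects are then negligible.
    M = math.ceil(math.log(1e9) / math.log(gamma))
    print(gamma, ratio(gamma, M, a), gamma ** (1 - a))
```

The two printed columns agree to many digits, and both approach 1 as $\gamma$ decreases to 1, which is the behaviour used when letting $k \to \infty$ in (35).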

Corollary 5.2. Let $(X, \mathcal{X}, \mu)$ be a probability space, $f_n\colon X \to \mathbb{R}$, $n \in \mathbb{N}$, be non-negative measurable functions, $(W_n)_{n \in \mathbb{N}}$ be as in the previous lemma, and for $N \in \mathbb{N}$ let

$A_N(x) := \frac{1}{W_N} \sum_{n=1}^{N} f_n(x)$.

Suppose that there exists a function $f\colon X \to \mathbb{R}$ and a sequence of real numbers $\gamma_k \in (1, \infty)$, with $\gamma_k \to 1$, and such that for every $k \in \mathbb{N}$ we have for almost every $x \in X$ that

(37) $\lim_{N \to \infty} A_{[\gamma_k^N]}(x) = f(x)$.

Then

$\lim_{N \to \infty} A_N(x) = f(x)$ for almost every $x \in X$.

Proof. It suffices to notice that for almost every $x \in X$ equation (37) is satisfied for every $k \in \mathbb{N}$, and then apply Lemma 5.1. □




5.2. Weighted averages. The following lemma is classical and can be proved using summation by parts (also the assumptions on the weights $w_n$ can be weakened).

Lemma 5.3. Let $(v_n)_{n \in \mathbb{N}}$ be a sequence of vectors in a normed space, $(w_n)_{n \in \mathbb{N}}$ be a decreasing sequence of positive real numbers that satisfies $w_n \sim n^{-a}$ for some $a \in (0, 1)$, and for $N \in \mathbb{N}$ let $W_N := w_1 + \cdots + w_N$. Then the averages $\frac{1}{N} \sum_{n=1}^{N} v_n$ and the averages $\frac{1}{W_N} \sum_{n=1}^{N} w_n v_n$ are asymptotically equal.

5.3. Van der Corput's lemma. We state a variation of a classical elementary estimate of 
van der Corput. 

Lemma 5.4. Let $V$ be an inner product space, $N \in \mathbb{N}$, and $v_1, \ldots, v_N \in V$. Then for every integer $M$ between 1 and $N$ we have

$\Big\| \sum_{n=1}^{N} v_n \Big\|^2 \le 2M^{-1}N \sum_{n=1}^{N} \|v_n\|^2 + 4M^{-1}N \sum_{m=1}^{M} \Big| \sum_{n=1}^{N-m} \langle v_{n+m}, v_n \rangle \Big|$.

In the case where $V = \mathbb{R}$ and $\|\cdot\| = |\cdot|$, the proof can be found, for example, in [20]. The proof in the general case is essentially identical.
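
As a sanity check, the inequality of Lemma 5.4 can be tested on random data. The sketch below takes $V = \mathbb{C}$ with inner product $\langle x, y \rangle = x\bar{y}$; this choice of space, the function name `vdc_sides` and the random test vector are ours. It only illustrates the statement and does not replace the proof.

```python
import random

def vdc_sides(v, M):
    """Return both sides of the van der Corput estimate for v_1, ..., v_N in C.

    The list v is 0-indexed, so <v_{n+m}, v_n> for n = 1..N-m becomes
    v[n + m] * conj(v[n]) for n = 0..N-m-1.
    """
    N = len(v)
    lhs = abs(sum(v)) ** 2
    energy = sum(abs(x) ** 2 for x in v)
    corr = sum(
        abs(sum(v[n + m] * v[n].conjugate() for n in range(N - m)))
        for m in range(1, M + 1)
    )
    rhs = 2 * N / M * energy + 4 * N / M * corr
    return lhs, rhs

rng = random.Random(1)
v = [complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(60)]
for M in (1, 5, 30, 60):
    lhs, rhs = vdc_sides(v, M)
    assert lhs <= rhs + 1e-9
```

With $M = N$, as in the proof of Proposition 3.1, the right hand side is essentially the full expansion of the square, so the estimate loses only constant factors.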

5.4. Borel-Cantelli in density. We are going to use the following Borel-Cantelli type lemma: 

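Lemma 5.5. Let $E_n$, $n \in \mathbb{N}$, be events on a probability space $(\Omega, \mathcal{F}, \mathbb{P})$ that satisfy $\mathbb{P}(E_n) \ll (\log n)^{-1-\varepsilon}$ for some $\varepsilon > 0$. Then almost surely the set $\{n \in \mathbb{N}\colon \omega \in E_n\}$ has zero density.

Proof. Let

$A_N(\omega) := \frac{1}{N} \sum_{n=1}^{N} \mathbf{1}_{E_n}(\omega)$.

Our assumption gives

$\mathbb{E}_\omega(A_N(\omega)) \ll (\log N)^{-1-\varepsilon}$.

Therefore, for every $\gamma > 1$

$\sum_{N=1}^{\infty} A_{[\gamma^N]}(\omega) < +\infty$

almost surely. This implies that for every $\gamma > 1$

$\lim_{N \to \infty} A_{[\gamma^N]}(\omega) = 0$ almost surely.

Since $\gamma > 1$ is arbitrary we conclude by Corollary 5.2 that

$\lim_{N \to \infty} A_N(\omega) = 0$ almost surely.

This proves the advertised claim. □

On the other hand, it is not hard to construct a probability space $(\Omega, \mathcal{F}, \mathbb{P})$ and events $E_n$, $n \in \mathbb{N}$, such that $\mathbb{P}(E_n) \le (\log n)^{-1}$, and almost surely the set $\{n \in \mathbb{N}\colon \omega \in E_n\}$ has positive upper density.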
5.5. Estimates for sums of random variables. We use some straightforward moment estimates to get two bounds for sums of independent random variables that were used in the proofs.

Lemma 5.6. Let $X_n$ be non-negative, uniformly bounded random variables, with $\mathbb{E}_\omega(X_n) \sim n^{-a}$ for some $a \in (0, 1)$. Suppose that the random variables $X_n - \mathbb{E}_\omega(X_n)$ are orthogonal. Then almost surely we have

$\lim_{N \to \infty} \frac{1}{W_N} \sum_{n=1}^{N} X_n = 1$,

where, as usual, $W_N := \sum_{n=1}^{N} \mathbb{E}_\omega(X_n)$.

Remark. Assuming independence, one can use Kolmogorov's three series theorem to show that the stated result holds under the relaxed assumption $W_N \to \infty$.

Proof. We can assume that $X_n(\omega) \le 1$ for every $\omega \in \Omega$ and $n \in \mathbb{N}$. We let

$A_N := \frac{1}{W_N} \sum_{n=1}^{N} Y_n$

where

$Y_n := X_n - \mathbb{E}_\omega(X_n)$.

Since $Y_n$ are zero mean orthogonal random variables and $\mathbb{E}_\omega(Y_n^2) \le \mathbb{E}_\omega(X_n)$, we have

$\mathbb{E}_\omega(A_N^2) = \frac{1}{W_N^2} \sum_{n=1}^{N} \mathbb{E}_\omega(Y_n^2) \le \frac{1}{W_N}$.

Combining this estimate with the fact $W_N \sim N^{1-a}$, we conclude that for every $\gamma > 1$ we have

$\sum_{N=1}^{\infty} \mathbb{E}_\omega\big(A_{[\gamma^N]}^2\big) < +\infty$.

Therefore, for every $\gamma > 1$ we have

$\lim_{N \to \infty} A_{[\gamma^N]} = 0$ almost surely,

or equivalently, that

$\lim_{N \to \infty} \frac{1}{W_{[\gamma^N]}} \sum_{n=1}^{[\gamma^N]} X_n = 1$ almost surely.

Since the sequence $(W_n)_{n \in \mathbb{N}}$ satisfies the assumptions of Corollary 5.2 and $X_n$ is non-negative, we conclude that

$\lim_{N \to \infty} \frac{1}{W_N} \sum_{n=1}^{N} X_n = 1$ almost surely.

This completes the proof. □
Lemma 5.7. Let $X_n$ be independent, uniformly bounded random variables, with $\mathbb{E}_\omega(X_n) \sim n^{-a}$ for some $a \in (0, 1/6)$, and let $b$ be any positive real number. Then almost surely we have

$\sum_{m=1}^{N^b} \sum_{n=1}^{N} X_{n+m} X_n \ll_\omega N^{b+1-2a}$.

Using a lacunary subsequence trick, similar to the one used in the proof of Lemma 5.6, one can show that the conclusion actually holds for every $a \in (0, 1/4)$.

Proof. Let

$S_N := \sum_{m=1}^{N^b} \sum_{n=1}^{N} \big( X_{n+m} X_n - \mathbb{E}_\omega(X_{n+m}) \cdot \mathbb{E}_\omega(X_n) \big)$

and

$A_N := N^{-c} S_N$ where $c := b + 1 - 2a$.

Since

$N^{-c} \sum_{m=1}^{N^b} \sum_{n=1}^{N} \mathbb{E}_\omega(X_{n+m}) \cdot \mathbb{E}_\omega(X_n) \ll 1$,

it suffices to show that almost surely we have $\lim_{N \to \infty} A_N = 0$.

Expanding $S_N^2$ and using the independence of the random variables $X_n$, we see that

$\mathbb{E}_\omega(S_N^2) \ll |\{(m, m', n, n') \in [1, N^b]^2 \times [1, N]^2\colon n, n', n+m, n'+m'$ are not distinct$\}| \ll N^{1+2b}$.

Therefore,

$\mathbb{E}_\omega(A_N^2) \ll N^{1+2b-2c} = N^{4a-1}$.

It follows that if $k \in \mathbb{N}$ satisfies $k(1 - 4a) > 1$, then

$\sum_{N=1}^{\infty} \mathbb{E}_\omega\big(A_{N^k}^2\big) < +\infty$.

As a consequence,

$\lim_{N \to \infty} A_{N^k} = 0$ for every $k \in \mathbb{N}$ satisfying $k(1 - 4a) > 1$.

For any such $k \in \mathbb{N}$, and for a given $N \in \mathbb{N}$, let $M \in \mathbb{N}$ be an integer such that $M^k \le N < (M+1)^k$. Then

\A N -A Mk \ < \{N~ c M kc -l)A Mk \+N~ c Y, 

M k <n<(M+l) k 

< \ (N~ c M kc - l)A Mk \ +N~ c M k - 1 . 

The first term converges almost surely to zero as $N \to \infty$, since this is the case for $A_{M^k}$ and $N^{-1} M^k \le 1$. The second term converges to zero if $kc > k - 1$, or equivalently, if $k(2a - b) < 1$.

Combining the above estimates, we get almost surely that $\lim_{N \to \infty} A_N = 0$, provided that there exists $k \in \mathbb{N}$ such that $k(2a - b) < 1 < k(1 - 4a)$. If $a < 1/6$, then $k = 3$ is such a value. This completes the proof. □